Easy2Siksha.com
GNDU Question Paper-2024
B.A. 5th Semester
QUANTITATIVE TECHNIQUES
(Quantitative Techniques-V)
Time Allowed: Three Hours Max. Marks: 100
Note: Attempt Five questions in all, selecting at least One question from each section. The
Fifth question may be attempted from any section. All questions carry equal marks.
SECTION-A
1. Explain the following terms:
(i) One tailed and two tailed tests
(ii) Critical Region
(iii) Type I and Type II Errors
2. (a) What do you mean by Maximum Likelihood Estimation (MLE) method? In what
sense it is different from Ordinary Least Squares (OLS) method? Discuss in detail.
(b) Distinguish between an estimate and an estimator. Also discuss the properties of a
good estimator.
SECTION-B
3. What do you understand by χ² (Chi-square) distribution? Derive its main properties.
4. Define Student's t-statistic and derive its chief properties.
SECTION-C
5. (a) In a sample of 500 persons, 280 are found to be rice eaters and the rest are wheat
eaters. Can we assume that both the articles are equally popular?
(b) Discuss the main applications of Z distribution.
6. In a laboratory experiment, two samples gave the following results:
Sample    Size    Sample Mean    Sum of squares of deviations from mean
1         10      15             90
2         12      14             108
Test the equality of sample variances at 5% level of significance.
SECTION-D
7. A trucking company wishes to test the average life of each of the four brands of tyres.
The company uses all brands on randomly selected trucks. The records showing the lives
(in thousands of miles) are given in the following table:
Brand 1    Brand 2    Brand 3    Brand 4
20         19         21         15
23         15         19         17
18         17         20         16
17         20         17         18
16         16
Test the hypothesis that average life of each brand of tyres is same.
8. What is analysis of variance technique? Discuss its main assumptions. Also distinguish
between one way and two way ANOVA techniques.
GNDU Answer Paper-2024
SECTION-A
1. Explain the following terms:
(i) One tailed and two tailed tests
(ii) Critical Region
(iii) Type I and Type II Errors
Ans: Understanding Key Concepts in Hypothesis Testing: A Simple Story
Imagine you are a detective trying to solve a mystery. In front of you is a locked box, and
you have a suspicion about what’s inside. But you can’t open it immediately; you have to
rely on clues, observations, and evidence. This is exactly what scientists and statisticians do
when they test hypotheses. They cannot always directly "see" the truth, but they use data,
rules, and logic to make decisions. In this story, we will explore the key concepts in
hypothesis testing: one-tailed and two-tailed tests, the critical region, and Type I and Type II
errors. By the end, these terms will feel as intuitive as solving a puzzle.
(i) One-tailed and Two-tailed Tests: Choosing Where to Look
Let’s continue with our detective analogy. Suppose you are investigating whether a new
medicine increases the recovery rate of patients. Your hypothesis is that this medicine helps
more than the standard treatment. Naturally, you are only interested if it increases
recovery, not if it decreases it. Here, you focus your attention in one direction, upwards,
looking for improvement. This is the essence of a one-tailed test.
A one-tailed test examines whether a parameter (like the mean recovery rate) is either
greater than or less than a specific value, but not both. The “tail” refers to the extreme end
of the probability distribution. If your observed data falls in this tail, you have enough
evidence to reject the null hypothesis (the default assumption that the medicine has no
effect).
Now, imagine another scenario. You don’t know whether the medicine helps or harms
patients; you only want to see if it makes a difference, either positive or negative. Here, you
must look in both directions, whether recovery improves or worsens. This is called a
two-tailed test. In a two-tailed test, the critical region is split between the two ends (tails) of the
probability distribution. Only if your data falls in either tail do you reject the null hypothesis.
To summarize:
One-tailed test: Focuses on a specific direction (increase or decrease).
Two-tailed test: Considers both directions (any difference).
A simple way to remember this is: one-tailed is like a sniper rifle, precise and aimed in one
direction; two-tailed is like a wide-angle camera, looking both ways to capture any change.
(ii) Critical Region: The Danger Zone
Next, think about the crime scene again. Imagine there’s a specific area where, if you find
evidence, you can confidently say, “Aha! This proves the suspect’s involvement.” In
statistics, this area is called the critical region.
The critical region is the set of values of a test statistic that leads to rejection of the null
hypothesis. It represents the extreme outcomes that are unlikely to occur if the null
hypothesis is true.
For example, consider a normal distribution of test scores for students. If the average score
under the null hypothesis is 50, and your significance level (α) is 5%, the critical region could
be:
For a one-tailed test (looking for higher scores), scores above 60 might fall in the
critical region.
For a two-tailed test (looking for any deviation), scores below 40 or above 60 might
fall in the critical region.
If your observed data lands in this “danger zone,” it signals that something unusual has
happened, prompting you to reject the null hypothesis.
So, the critical region is essentially your statistical red flag: a clear signal that the evidence
is strong enough to take action.
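The 40/60 cutoffs above are illustrative; in practice the boundaries of the critical region come from the distribution of the test statistic. A minimal sketch (using only Python's standard library, at the α = 0.05 level from the example):

```python
# Sketch: standard-normal critical values at alpha = 0.05, which define
# the "danger zone" for a z-test. Uses only the Python standard library.
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, sd 1

# One-tailed (upper) test: the whole alpha sits in the right tail.
one_tailed_cut = std_normal.inv_cdf(1 - alpha)       # ≈ 1.645

# Two-tailed test: alpha is split between the two tails.
two_tailed_cut = std_normal.inv_cdf(1 - alpha / 2)   # ≈ 1.960

print(f"one-tailed:  reject if z > {one_tailed_cut:.3f}")
print(f"two-tailed:  reject if |z| > {two_tailed_cut:.3f}")
```

These are the familiar 1.645 and 1.96 cutoffs used later in Section C.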
(iii) Type I and Type II Errors: When Detectives Make Mistakes
Even the best detective can make mistakes. In hypothesis testing, errors happen because
decisions are based on samples rather than the entire population. There are two types of
mistakes: Type I error and Type II error.
Type I Error: False Alarm
Imagine you are investigating a crime, and you see a shadow and immediately accuse
someone, only to find later that it was a harmless passerby. In statistics, this is a Type I
error. It occurs when you reject the null hypothesis even though it is actually true.
Using our medicine example: if the medicine actually has no effect, but your test concludes
that it works, you have made a Type I error. This is why we set a significance level (α), often
0.05 or 5%. This level represents the probability of committing a Type I error. In other
words, you are willing to accept a 5% chance of a false alarm.
Type II Error: Missed Opportunity
Now imagine the reverse scenario. A suspect is indeed guilty, but you fail to catch them
because the evidence was subtle. In statistics, this is a Type II error. It occurs when you fail
to reject the null hypothesis even though it is false.
In the medicine case, if the medicine actually helps patients, but your test concludes it
doesn’t, that is a Type II error. The probability of this error is denoted by β. A high β means
you are more likely to miss true effects, so researchers aim to minimize it while balancing
Type I error.
To visualize:
Type I error (α): Saying “Yes, it works!” when it doesn’t. A false positive.
Type II error (β): Saying “No, it doesn’t work!” when it actually does. A false
negative.
Finding the balance between these errors is like walking a tightrope. Lowering α (making the
test stricter) reduces false alarms but increases the chance of missing real effects (Type II
error). Raising α makes it easier to detect true effects but risks more false alarms.
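This trade-off can be seen in a small simulation (a sketch, not part of the original answer): generate data under a true null hypothesis and count how often a two-tailed z-test raises a false alarm.

```python
# Simulation sketch: estimate the Type I error rate of a two-tailed
# z-test when H0 is actually true (true mean 0, known sd 1).
import random
from statistics import NormalDist

random.seed(42)
crit = NormalDist().inv_cdf(0.975)   # two-tailed cutoff at alpha = 0.05
n, trials = 30, 2000
false_alarms = 0

for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = sum(sample) / n ** 0.5       # xbar / (1 / sqrt(n)), with sd = 1
    if abs(z) > crit:
        false_alarms += 1            # rejected a true H0: a Type I error

print(f"observed Type I error rate ≈ {false_alarms / trials:.3f}")
```

With many trials the observed rate settles near the chosen α of 0.05, exactly as the definition promises.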
Connecting the Concepts
All these concepts (one-tailed/two-tailed tests, the critical region, and Type I/II errors) are
connected like pieces of a detective toolkit.
1. Decide the tail: Are you looking in one direction (one-tailed) or both directions (two-
tailed)?
2. Set the danger zone: Define the critical region where evidence is strong enough to
reject the null.
3. Balance risks: Understand that mistakes can happen (Type I or Type II errors) and
plan your significance level accordingly.
Imagine conducting an experiment is like planning a careful investigation. You choose your
approach, define your criteria for action, and acknowledge the possibility of mistakes. This
way, decisions are systematic, logical, and defensible.
Conclusion
To sum up, hypothesis testing is not just numbers on a page; it is a story of curiosity, logic,
and careful judgment.
One-tailed and two-tailed tests determine where to look for evidence.
Critical region marks the danger zone where you must act.
Type I and Type II errors remind us that mistakes are possible, and we must
carefully balance risks.
Understanding these concepts is like becoming a skilled detective: observant, cautious, and
aware of both possibilities and limitations. With this mindset, you not only perform tests
accurately but also appreciate the art and logic behind every statistical decision.
2. (a) What do you mean by Maximum Likelihood Estimation (MLE) method? In what
sense it is different from Ordinary Least Squares (OLS) method? Discuss in detail.
(b) Distinguish between an estimate and an estimator. Also discuss the properties of a
good estimator.
Ans: Imagine you are a detective trying to solve a mystery. You have a set of clues (data
points), and your job is to figure out the most likely story behind them. Sometimes you try
to minimize the “distance” between the clues and your explanation (that’s like Ordinary
Least Squares). Other times, you ask: “Given these clues, what explanation makes them
most probable?” (that’s like Maximum Likelihood Estimation). Both methods aim to
uncover the truth, but they use different strategies.
Now let’s carefully unfold this topic in two parts: (a) Maximum Likelihood Estimation (MLE)
and how it differs from Ordinary Least Squares (OLS). (b) The difference between an
estimate and an estimator, along with the properties of a good estimator.
Part (a): Maximum Likelihood Estimation (MLE) and its Difference from OLS
1. What is Maximum Likelihood Estimation (MLE)?
MLE is a method of estimating the parameters of a statistical model.
The idea is simple: choose the parameter values that make the observed data most
“likely.”
In other words, MLE asks: “If these are the data points I see, what parameter values
maximize the probability of observing them?”
Step-by-step intuition:
1. Suppose you toss a coin 10 times and get 7 heads.
2. You want to estimate the probability of heads (p).
3. The likelihood of observing 7 heads out of 10 is given by the binomial formula.
4. MLE says: choose the value of p that maximizes this likelihood.
5. In this case, p = 0.7 is the MLE.
So, MLE is like tuning your model until it gives the highest probability to the data you
actually observed.
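The coin example can be sketched in a few lines: evaluate the binomial likelihood over a grid of candidate values of p and pick the maximizer (an illustrative brute-force version of MLE; in practice one maximizes the log-likelihood analytically or numerically).

```python
# Sketch of the coin example: find the p that maximizes the likelihood
# of observing 7 heads in 10 tosses, by a simple grid search.
from math import comb

heads, n = 7, 10

def likelihood(p):
    # Binomial probability of exactly `heads` successes in n tosses.
    return comb(n, heads) * p ** heads * (1 - p) ** (n - heads)

candidates = [i / 1000 for i in range(1, 1000)]
p_hat = max(candidates, key=likelihood)
print(f"MLE of p ≈ {p_hat}")   # 0.7, matching heads / n
```

The grid search lands on p = 0.7, the same answer the analytic MLE (heads/n) gives.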
2. What is Ordinary Least Squares (OLS)?
OLS is a method used mainly in regression analysis.
It estimates parameters by minimizing the sum of squared differences between
observed values and predicted values.
In simple linear regression, OLS finds the line that minimizes the squared vertical
distances between the data points and the line.
Example: If you are fitting a line to predict students’ marks from study hours, OLS will
choose the slope and intercept that minimize the squared errors between actual marks and
predicted marks.
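For the study-hours example, the OLS slope and intercept have a closed form. Here is a sketch with made-up data (the numbers are hypothetical, not from the text):

```python
# Illustrative OLS fit for the marks-vs-study-hours example.
# Data points are invented for demonstration.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]

n = len(hours)
mx, my = sum(hours) / n, sum(marks) / n

# Closed-form OLS solution for simple linear regression:
# slope = Sxy / Sxx, intercept = ybar - slope * xbar.
slope = sum((x - mx) * (y - my) for x, y in zip(hours, marks)) / \
        sum((x - mx) ** 2 for x in hours)
intercept = my - slope * mx
print(f"marks ≈ {intercept:.2f} + {slope:.2f} * hours")
```

This is exactly the line that minimizes the sum of squared vertical distances to the five points.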
3. Key Differences Between MLE and OLS

| Aspect | Maximum Likelihood Estimation (MLE) | Ordinary Least Squares (OLS) |
| --- | --- | --- |
| Philosophy | Chooses parameters that make the observed data most probable (maximizes the likelihood). | Chooses parameters that minimize the sum of squared errors. |
| Applicability | A general method, applicable to a wide range of models (binomial, Poisson, regression, and more). | Mainly used for linear regression models. |
| Assumptions | Requires specifying the full probability distribution of the data. | Assumes errors are normally distributed with constant variance. |
| Output | Produces estimates that are consistent and asymptotically efficient under regularity conditions. | Produces best linear unbiased estimates under Gauss-Markov assumptions. |
| Flexibility | Very flexible: works wherever a likelihood function can be written down. | Limited to situations where minimizing squared errors makes sense. |
| Interpretation | "Which parameter values make my observed data most likely?" | "Which line best fits the data by minimizing squared errors?" |
4. Connection Between MLE and OLS
Interestingly, in the case of linear regression with normally distributed errors, OLS and MLE
give the same estimates. This is because minimizing squared errors is mathematically
equivalent to maximizing the likelihood under normality.
So, OLS can be seen as a special case of MLE.
Part (b): Estimate vs Estimator and Properties of a Good Estimator
1. Estimate vs Estimator
Estimator:
o A rule, formula, or method used to calculate an unknown parameter from
data.
o It is a random variable because it depends on the sample.
o Example: The sample mean x̄ is an estimator of the population mean μ.
Estimate:
o The actual numerical value obtained when the estimator is applied to a
specific sample.
o Example: If your sample mean x̄ = 52, then 52 is the estimate of μ.
Analogy: Think of an estimator as a recipe, and an estimate as the dish you get when you
follow the recipe with actual ingredients (data).
2. Properties of a Good Estimator
A good estimator should have certain desirable properties. Let’s explore them:
(a) Unbiasedness
An estimator is unbiased if its expected value equals the true parameter.
Example: The sample mean is an unbiased estimator of the population mean.
Why important? Because on average, it hits the target.
(b) Consistency
An estimator is consistent if, as the sample size increases, it converges to the true
parameter.
Example: With more and more coin tosses, the sample proportion of heads
approaches the true probability.
Why important? Because with enough data, you can trust the estimator.
(c) Efficiency
Among unbiased estimators, the one with the smallest variance is most efficient.
Example: If two estimators are unbiased, but one fluctuates less across samples, it is
preferred.
Why important? Because it gives more precise estimates.
(d) Sufficiency
An estimator is sufficient if it uses all the information in the sample about the
parameter.
Example: The sample mean is sufficient for the population mean in a normal
distribution.
Why important? Because no information is wasted.
(e) Simplicity (Practicality)
A good estimator should be easy to compute and understand.
Why important? Because overly complex estimators may not be practical in
real-world use.
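Unbiasedness and consistency of the sample mean can be checked with a short simulation (a sketch, not from the text; the true mean of 5 and sd of 2 are made-up values):

```python
# Simulation sketch: the sample mean as an unbiased, consistent
# estimator of a (made-up) true mean of 5.
import random

random.seed(0)
true_mean = 5.0

def sample_mean(n):
    return sum(random.gauss(true_mean, 2) for _ in range(n)) / n

# Unbiasedness: averaging many small-sample estimates lands near 5.
avg_estimate = sum(sample_mean(10) for _ in range(2000)) / 2000
# Consistency: a single very large sample is itself close to 5.
big_estimate = sample_mean(100_000)

print(f"average of many small-sample means: {avg_estimate:.3f}")
print(f"one very large-sample mean:         {big_estimate:.3f}")
```

Both numbers sit close to the true mean, illustrating "hits the target on average" and "converges with more data".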
Story Analogy for Estimators
Imagine you are an archer aiming at a target:
Unbiasedness: Your arrows, on average, hit the bull’s-eye.
Consistency: The more arrows you shoot, the closer they cluster around the
bull’s-eye.
Efficiency: Your arrows are tightly grouped, not scattered.
Sufficiency: You use all the information available about the target’s position.
Simplicity: Your bow is easy to handle, not overly complicated.
A good estimator is like a skilled archer: accurate, precise, reliable, and efficient.
Conclusion
Let’s tie it all together:
Maximum Likelihood Estimation (MLE) is a general method that chooses
parameters making the observed data most probable.
Ordinary Least Squares (OLS) is a specific method that minimizes squared errors,
and in the case of normal errors, it coincides with MLE.
Estimator vs Estimate: An estimator is the rule or formula; an estimate is the actual
number you get from data.
Good Estimators: They should be unbiased, consistent, efficient, sufficient, and
simple.
In the end, both MLE and OLS are like detectives solving mysteries with different strategies,
while estimators are the tools they use. A good estimator is one that consistently points us
toward the truth, just as a good detective consistently uncovers the right culprit.
SECTION-B
3. What do you understand by χ² (Chi-square) distribution? Derive its main properties.
Ans: Understanding the Chi-Square (χ²) Distribution
Imagine you are a curious explorer trying to understand the hidden patterns in a sea of
numbers. Every number seems random, but you know there’s a secret logic behind their
arrangement. This is exactly how statisticians approach data: they want to find patterns,
measure variability, and test hypotheses. One of the most powerful tools in this journey is
the Chi-Square distribution, often denoted by χ² (chi-square).
At its heart, the chi-square distribution is about measuring how far observations deviate
from expected values. But instead of looking at each deviation in isolation, it combines
them into a single statistic, squaring the deviations to ensure both positive and negative
differences contribute equally. This approach gives us a meaningful measure of variation
that is widely used in statistics, from testing independence in tables to estimating variances.
Origin and Concept
Let’s travel back in time for a moment. The chi-square distribution emerged from the work
of Karl Pearson in the early 20th century. He wanted a method to test whether observed
data followed a theoretical pattern. For example, if you flip a fair coin 100 times, you expect
50 heads and 50 tails. But in reality, you might get 47 heads and 53 tails. How do you know
if this deviation is just by chance, or if something is wrong with the coin? Karl Pearson
devised the chi-square test to answer such questions.
Mathematically, the χ² distribution arises from the sum of the squares of independent
standard normal random variables. Let’s understand this in steps:
1. Standard Normal Variables: Imagine variables like Z₁, Z₂, ..., Zₙ that follow a standard
normal distribution (mean 0, variance 1). These are your “building blocks” of
randomness.
2. Squaring and Summing: If you square each of these variables and add them
together, you get a new variable:

χ² = Z₁² + Z₂² + … + Zₙ²

This new variable, χ², follows the chi-square distribution with n degrees of freedom (df).
The degrees of freedom typically correspond to the number of independent pieces of
information used in the calculation.
So, in essence, χ² is a way of combining squared deviations from the mean in a
standardized manner.
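This construction can be checked empirically (a simulation sketch, not part of the original answer): build chi-square variates as sums of squared standard normals and compare the sample mean and variance with the theoretical values (mean = df, variance = 2·df).

```python
# Simulation sketch: chi-square variates as sums of squared standard
# normals; their mean should be near df and their variance near 2*df.
import random

random.seed(1)
df, trials = 5, 20000
values = [sum(random.gauss(0, 1) ** 2 for _ in range(df))
          for _ in range(trials)]

mean = sum(values) / trials
var = sum((v - mean) ** 2 for v in values) / trials
print(f"mean ≈ {mean:.2f} (theory: {df})")
print(f"variance ≈ {var:.2f} (theory: {2 * df})")
```

All values are non-negative by construction, matching the distribution's support χ² ≥ 0.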
Properties of the Chi-Square Distribution
Now, let’s explore the main properties of the chi-square distribution, one by one, like
chapters in an unfolding story:
1. Distribution Type and Shape
The chi-square distribution is positively skewed (right-skewed) and is defined only for non-
negative values (χ² ≥ 0). This makes sense intuitively: since it is the sum of squares, it can
never be negative.
For small degrees of freedom (df), the distribution is highly skewed.
As the degrees of freedom increase, the distribution becomes more symmetric and
starts to resemble a normal distribution.
2. Degrees of Freedom
The degrees of freedom (df), denoted by ν, are central to the chi-square distribution. They
usually represent the number of independent standard normal variables squared and
summed.
For example, if χ² is based on 5 independent squared standard normals, then df = 5.
In practical applications, degrees of freedom adjust for constraints in data, such as
estimating parameters.
3. Mean and Variance
The mean of a χ² distribution is equal to its degrees of freedom:

E(χ²) = ν

The variance is twice the degrees of freedom:

Var(χ²) = 2ν

This property is particularly elegant because it shows how the spread of the distribution
grows with the number of independent variables.
4. Additivity Property
One of the most beautiful aspects of the chi-square distribution is its additivity:
If you have two independent chi-square variables, χ²₁ with ν₁ df and χ²₂ with ν₂ df,
then their sum is also chi-square distributed with ν₁ + ν₂ df:

χ²₁ + χ²₂ ~ χ² with (ν₁ + ν₂) df

This property is extremely useful in combining information from different sources.
5. Moment-Generating Function (MGF)
The chi-square distribution has a moment-generating function (MGF), which allows
statisticians to compute expected values and higher moments easily. The MGF of χ² with ν
df is:

M(t) = (1 − 2t)^(−ν/2), valid for t < 1/2

This elegant function helps derive the mean, variance, and other characteristics
mathematically.
6. Skewness and Kurtosis
Skewness: The chi-square distribution is skewed to the right, with skewness = √(8/ν).
As ν increases, skewness decreases.
Kurtosis: The kurtosis measures "peakedness." For the χ² distribution, the excess kurtosis = 12/ν.
Again, as df increases, the excess kurtosis shrinks toward zero and the distribution looks
more like a normal curve.
Applications and Intuition
Let’s bring this abstract concept back to reality. Imagine a classroom where a teacher wants
to see if students’ performance matches expected levels. The teacher expects 20 students
to get grade A, 30 grade B, and 50 grade C. After the exam, the actual numbers differ
slightly. By applying a chi-square test, the teacher can measure whether the observed
differences are significant or just due to random chance.
Similarly, the χ² distribution is crucial in:
1. Goodness-of-Fit Tests: Checking whether observed data follow a theoretical
distribution.
2. Test of Independence: Used in contingency tables to see if two categorical variables
are related.
3. Confidence Interval for Variance: In normal populations, χ² helps estimate
population variance from sample variance.
Story Analogy
Think of each standard normal variable as a musician in an orchestra. Individually, each
plays its own tune (random, with mean zero). Squaring each tune amplifies the strength
(removing negative notes) and summing them creates a powerful collective sound: this is
χ². Depending on the number of musicians (degrees of freedom), the symphony can be
simple, skewed, or rich and balanced. And just like a symphony can combine multiple
sections, the chi-square distribution allows us to combine independent sources of
information into a single meaningful measure.
Conclusion
In summary, the chi-square distribution is more than just a formula. It is a story of
randomness, deviation, and collective measurement. By summing squared standard
normal variables, it captures the essence of variation and allows statisticians to test
hypotheses and understand patterns. Its properties (the mean, variance, skewness,
additivity, and dependence on degrees of freedom) make it a versatile and powerful tool in
statistics.
By visualizing χ² as the sum of squared deviations and understanding its properties, one can
not only apply it in real-world data analysis but also appreciate the underlying elegance of
statistical thinking.
The chi-square distribution reminds us that in the world of numbers, even chaos has a
pattern waiting to be discovered, and with the right tools, we can make sense of it.
4. Define Student's t-statistic and derive its chief properties.
Ans: Imagine a group of students in a statistics class. They are asked to estimate the average
height of all the students in their university. Of course, they cannot measure everyone, so
they take a small sample. Now comes the challenge: how do they test whether their sample
mean is a reliable estimate of the population mean, especially when the population variance
is unknown?
This is exactly the situation that gave birth to the Student’s t-statistic, introduced by William
Sealy Gosset (who wrote under the pen name “Student”). The t-statistic became one of the
most powerful tools in inferential statistics, especially when dealing with small samples.
Let’s carefully unfold this concept: first by defining the t-statistic, then by deriving its chief
properties, and finally by understanding why it is so important.
Definition of Student's t-Statistic
Suppose we have a random sample X₁, X₂, …, Xₙ drawn from a normal population with mean
μ and variance σ².

The sample mean is:

X̄ = (X₁ + X₂ + … + Xₙ)/n

The sample variance is:

S² = Σ(Xᵢ − X̄)²/(n − 1)

Now, if the population variance σ² were known, we could use the z-statistic:

Z = (X̄ − μ)/(σ/√n)

which follows the standard normal distribution.
But in real life, σ² is usually unknown. So we replace it with the sample variance S². This
gives us the Student's t-statistic:

t = (X̄ − μ)/(S/√n)

This statistic follows a t-distribution with (n − 1) degrees of freedom.
Why the t-Statistic is Needed
When the sample size is large, the sample variance S² is a good approximation of σ²,
and the z-test works fine.
But when the sample size is small, replacing σ with S introduces extra variability.
The t-distribution accounts for this extra uncertainty, making it more reliable for
small samples.
So, the t-statistic is like a careful student who admits: “I don’t know the population variance,
but I’ll use my sample variance and adjust for the uncertainty.”
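A quick numeric sketch (the sample values and the hypothesized mean of 50 are made up for illustration): computing t for a small sample using only the standard library.

```python
# Sketch: the t-statistic for a small made-up sample, testing H0: mu = 50.
from statistics import mean, stdev

sample = [52, 49, 53, 55, 51, 50, 54, 48]
mu0 = 50
n = len(sample)

xbar = mean(sample)
s = stdev(sample)                  # sample sd, divisor n - 1
t = (xbar - mu0) / (s / n ** 0.5)  # follows t with n - 1 = 7 df under H0
print(f"xbar = {xbar:.2f}, s = {s:.2f}, t = {t:.3f}")
```

The resulting t would then be compared against the t-distribution with 7 degrees of freedom rather than the standard normal, precisely because S replaced σ.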
Derivation of the t-Distribution
To understand the properties of the t-statistic, let’s see how it is derived.
1. From probability theory, we know that:

Z = (X̄ − μ)/(σ/√n) ~ N(0, 1)

2. Also, the sample variance is related to the chi-square distribution:

(n − 1)S²/σ² ~ χ² with (n − 1) df

3. Importantly, Z and S² are independent.
4. Combining these, the t-statistic can be written as:

t = Z / √[χ²/(n − 1)] = (X̄ − μ)/(S/√n)

This ratio defines the t-distribution with (n − 1) degrees of freedom.
Chief Properties of the Student's t-Distribution
Now let’s explore the main properties of the t-statistic and its distribution.
1. Symmetry
The t-distribution is symmetric about zero, just like the standard normal distribution.
This means positive and negative deviations are equally likely.
2. Mean
For degrees of freedom ν > 1, the mean of the t-distribution is 0.
This makes sense because, under the true mean μ, positive and negative values of the
t-statistic are equally likely.
3. Variance
For ν > 2, the variance of the t-distribution is:

Var(t) = ν/(ν − 2)

Notice that this variance is larger than 1, meaning the t-distribution is more spread
out than the normal distribution.
As ν → ∞, the variance approaches 1, and the t-distribution converges to the
standard normal distribution.
4. Kurtosis (Tails)
The t-distribution has heavier tails than the normal distribution.
This means extreme values are more likely.
This property makes the t-test more cautious, especially with small samples.
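The heavier tails can be demonstrated with a simulation (a sketch, not from the text) that builds t-variates from their definition, Z/√(χ²/df), and counts how often they exceed 2 in absolute value:

```python
# Simulation sketch: t built as Z / sqrt(chi2/df) has heavier tails than
# the standard normal, and less so as df grows.
import random

random.seed(7)

def t_variate(df):
    z = random.gauss(0, 1)
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(df))
    return z / (chi2 / df) ** 0.5

def tail_rate(df, trials=10000):
    return sum(abs(t_variate(df)) > 2 for _ in range(trials)) / trials

normal_rate = sum(abs(random.gauss(0, 1)) > 2 for _ in range(10000)) / 10000
print(f"P(|t| > 2), df=3:   {tail_rate(3):.3f}")
print(f"P(|t| > 2), df=30:  {tail_rate(30):.3f}")
print(f"P(|Z| > 2), normal: {normal_rate:.3f}")
```

Extreme values are clearly more frequent at small df, which is why small-sample critical values are larger.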
5. Degrees of Freedom (df)
The shape of the t-distribution depends on the degrees of freedom ν = n − 1.
With small df, the distribution is flatter and wider.
As df increases, it becomes closer to the normal distribution.
6. Independence from Scale
The t-statistic is scale-free: it does not depend on the units of measurement.
Whether you measure height in centimeters or inches, the t-statistic remains the
same.
7. Use in Hypothesis Testing
The t-statistic is widely used in testing hypotheses about population means when
variance is unknown.
Examples: one-sample t-test, two-sample t-test, paired t-test.
Story Analogy
Think of the t-distribution as a cautious younger sibling of the normal distribution. The
normal distribution is confident and slim, assuming we know everything about the
population variance. The t-distribution, on the other hand, says: “Wait, we don’t know the
variance exactly, so let’s allow for more uncertainty.” That’s why it has heavier tails—it gives
more room for error.
As the sample size grows, the younger sibling gains confidence and eventually looks just like
the normal distribution.
Importance in Practice
Small Samples: The t-statistic is essential when working with small datasets, which is
common in psychology, medicine, and social sciences.
Unknown Variance: Since population variance is rarely known, the t-test is more
practical than the z-test.
Robustness: Even when data are not perfectly normal, the t-test often works well,
especially with moderate sample sizes.
Summary of Chief Properties
1. The t-statistic is defined as:

t = (X̄ − μ)/(S/√n)

and it follows a t-distribution with (n − 1) degrees of freedom.
2. Properties of the t-distribution:
o Symmetric about zero.
o Mean = 0 (for df > 1).
o Variance = ν/(ν − 2) (for df > 2).
o Heavier tails than the normal distribution.
o Shape depends on degrees of freedom.
o Approaches the normal distribution as df → ∞.
Conclusion
The Student’s t-statistic is one of the cornerstones of inferential statistics. It was born out of
a practical need: how to make reliable inferences about a population when the sample is
small and the variance is unknown.
Its properties (symmetry, heavier tails, dependence on degrees of freedom, and
convergence to the normal distribution) make it both cautious and powerful. It reminds us
that statistics is not just about formulas but about humility: acknowledging uncertainty and
adjusting for it.
So, the next time you see a t-test result, remember the story of Gosset, the “Student,” and
how his statistic became a trusted companion for researchers across the world.
SECTION-C
5. (a) In a sample of 500 persons, 280 are found to be rice eaters and the rest are wheat
eaters. Can we assume that both the articles are equally popular?
(b) Discuss the main applications of Z distribution.
Ans: Understanding the Popularity of Rice and Wheat through a Story of a Village Survey
Imagine a small town called Grainville, where people live simple lives, and meals revolve
around either rice or wheat. Some residents enjoy steaming hot rice with vegetables, while
others swear by golden wheat rotis with their curries. One day, the town council decides to
find out which staple is more popular, rice or wheat, because they want to plan grain
supplies for the next season.
To get a clear picture, the council conducts a survey of 500 people. After careful counting,
they find something interesting:
280 people eat rice
220 people eat wheat
Now, they have a question: “Can we say that rice and wheat are equally popular?” To
answer this, we have to move beyond simply looking at the numbers and bring in the world
of statistics, which helps us turn observations into meaningful conclusions.
Part (a): Can We Assume Rice and Wheat Are Equally Popular?
At first glance, someone might say, “280 is not equal to 220, so rice is more popular.” But
statistics teaches us that numbers alone don’t tell the full story. We must ask: Is this
difference due to random chance, or does it reflect a real preference among the people?
This is where the Z test for proportions comes into play. The Z test is a method in statistics
that lets us check whether a sample proportion is significantly different from an expected
proportion.
Step 1: Define Hypotheses
In statistical terms, we first form two hypotheses:
Null hypothesis (H₀): Rice and wheat are equally popular.
Alternative hypothesis (H₁): Rice and wheat are not equally popular.
Mathematically, if both grains are equally popular, each should be eaten by 50% of the
people. That means:
Expected rice eaters = 500 × 0.5 = 250
Expected wheat eaters = 500 × 0.5 = 250
Step 2: Observe the Difference
From the survey:
Actual rice eaters = 280
Actual wheat eaters = 220
The difference between observed and expected rice eaters = 280 − 250 = 30
Now, we need to see if this difference of 30 is large enough to say rice is truly more popular,
or if it could have happened by chance.
Step 3: Apply the Z Formula for Proportions
The formula for Z in testing proportions is:

Z = (p − P) / √(P(1 − P)/n)

Where:
p = sample proportion (observed proportion)
P = population proportion (expected proportion)
n = sample size
Let's calculate:

p = 280/500 = 0.56, P = 0.50, n = 500

Z = (0.56 − 0.50) / √(0.50 × 0.50/500) = 0.06/0.0224 ≈ 2.68
Step 4: Interpret the Z Value
In statistics, we often compare the Z value to critical values at a 5% significance level:
Two-tailed test: critical Z ≈ ±1.96
Since 2.68 > 1.96, we reject the null hypothesis.
Conclusion: The difference in popularity is statistically significant. This means rice is slightly
more popular than wheat among the people of Grainville, and this difference is unlikely due
to random chance.
Part (b): Applications of Z Distribution
Now that we’ve seen how the Z test helps us check the popularity of rice and wheat, let’s
take a step back and talk about the Z distribution, which makes this analysis possible.
Imagine it as a powerful tool in a statistician’s toolkit, ready to solve many real-world
problems.
The Z distribution, also called the standard normal distribution, is a bell-shaped curve
centered at zero with a standard deviation of one. Every Z value tells us how far a data point
is from the average, measured in standard deviations.
Think of it as a ruler: one end shows unusually low values, the middle shows typical values,
and the other end shows unusually high values.
Main Applications of Z Distribution
1. Testing Hypotheses about Means and Proportions
We just saw an example with rice and wheat.
Z tests can determine if a sample mean or proportion significantly differs from a
population mean or proportion.
Example: Testing whether students in a school prefer online classes over offline
classes.
2. Finding Probabilities
The Z distribution is useful in probability calculations.
It tells us the likelihood that a particular observation falls above, below, or between
specific values.
Example: What is the probability that a randomly selected student scores above 80%
on a test?
3. Standardizing Scores
Z scores allow comparison of different sets of data on the same scale.
For instance, if two students take different exams, their Z scores can tell who
performed better relative to their classmates.
This is especially helpful in sports, academics, and performance evaluations.
4. Quality Control in Industries
Industries use Z distribution in control charts to monitor product quality.
Example: Checking whether the diameter of machine-produced bolts stays within
acceptable limits.
If a Z score is too high or too low, it signals a problem in production.
5. Research and Social Sciences
Researchers use Z distribution to analyze survey results, test new policies, and check
the effectiveness of interventions.
Example: In medicine, testing whether a new drug has a significant effect compared
to a placebo.
6. Finance and Risk Management
Z distribution helps banks and financial analysts calculate probabilities of extreme
events, like stock market crashes or unusual returns.
Example: Determining the probability of a loan default exceeding a critical level.
7. Comparing Two Groups
In studies comparing two populations, Z tests can be used to check differences in
means or proportions.
Example: Comparing the average height of boys and girls in a school, or comparing
election preferences between regions.
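Several of these applications reduce to evaluating the standard normal CDF, which the Python standard library exposes through math.erf. A sketch using the exam-score example from point 2 (the mean of 70 and standard deviation of 10 are hypothetical, chosen only for illustration):

```python
import math

def phi(z):
    """Standard normal CDF, expressed via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hypothetical exam: mean 70%, standard deviation 10%
mean, sd = 70, 10
score = 80

# Standardize: a score of 80 is 1 standard deviation above the mean
z = (score - mean) / sd        # 1.0

# Probability that a randomly selected student scores above 80%
p_above = 1 - phi(z)
print(round(p_above, 4))       # 0.1587
```

The same phi function serves hypothesis tests, quality-control limits, and risk calculations alike: only the standardized Z value changes.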
Connecting Part (a) and Part (b)
The rice and wheat example becomes clear when we connect it to the broader applications
of Z distribution:
The observed difference in rice and wheat eaters is measured using a Z test for
proportions.
The Z value tells us whether the observed difference is meaningful or just due to
chance.
This same principle can be applied to almost any comparison problem: whether
comparing test scores, market shares, or patient outcomes.
In other words, the Z distribution is like a universal translator. No matter the field (education,
agriculture, medicine, or finance), it helps us interpret numbers in a probabilistic,
standardized way.
A Story Within the Numbers
Returning to Grainville, imagine the town council now holding a meeting to decide which
grain to stock in larger quantities. They don’t just rely on intuition. They use the Z test to
understand the statistical truth.
They find that rice has a clear edge.
Using this insight, they plan rice supplies accordingly, ensuring that most people are
happy.
Meanwhile, the council also learns about Z distribution, realizing that the same approach
can help in planning school lunches, hospital diets, or even election campaigns.
The Z distribution becomes a hero in their town: a tool that turns raw numbers into
actionable decisions.
Conclusion
In summary:
1. Rice vs. Wheat Popularity: In a sample of 500, rice eaters (280) were significantly
more than wheat eaters (220). Using the Z test for proportions, we rejected the idea
that both grains were equally popular. Statistics allowed the town council to make a
fact-based decision rather than relying on guesswork.
2. Z Distribution Applications: The Z distribution is a versatile tool in statistics. It is used
for hypothesis testing, probability calculations, quality control, standardization of
scores, comparing groups, and research analysis across different fields.
By looking at numbers through the lens of Z distribution, we can see beyond the obvious,
detect meaningful differences, and make smarter decisions. Just like in Grainville, whether
it’s rice or wheat, or any other real-world problem, Z distribution helps turn data into
knowledge and action.
6. In a laboratory experiment, two samples gave the following results:

Sample   Size   Sample Mean   Sum of squares of deviations from mean
  1       10        15                        90
  2       12        14                       108

Test the equality of sample variances at 5% level of significance.
Ans: Problem setup and intuition
You’re given:
Sample 1:
o Size: n1=10
o Sample mean: xˉ1=15
o Sum of squares of deviations from mean: SS1=90
Sample 2:
o Size: n2=12
o Sample mean: xˉ2=14
o Sum of squares of deviations from mean: SS2=108
You’re asked to test the equality of population variances at the 5% level of significance.
That’s a classic two-sample variance comparison using the F-test, with a two-sided
alternative.
Hypotheses and test choice
Because we’re comparing variances from two independent normal samples, the natural test
statistic is the F-ratio, formed as the ratio of the sample variances. To keep the test one-
sided in the upper tail (for convenience), we place the larger sample variance in the
numerator.
Compute sample variances from given sums of squares
The unbiased sample variance for each sample is:

s² = SS / (n − 1)

For Sample 1:
s₁² = 90 / (10 − 1) = 90 / 9 = 10

For Sample 2:
s₂² = 108 / (12 − 1) = 108 / 11 ≈ 9.818

The degrees of freedom for the numerator and denominator are:
v₁ = n₁ − 1 = 9, v₂ = n₂ − 1 = 11
Decision rule and critical values
While exact critical values require tables or software, standard tables give approximately:

F₀.₀₅(9, 11) ≈ 2.90 (5% one-tailed)
F₀.₀₂₅(9, 11) ≈ 3.59 (5% two-tailed)

Our computed statistic is:

F = s₁² / s₂² = 10 / 9.818 ≈ 1.0185

This is far smaller than either of those critical values, so it clearly does not enter the
rejection region.
P-value perspective (intuition)
Because F ≈ 1.02 sits almost exactly at 1, the value expected when the two variances are
truly equal, the corresponding p-value is very large, far above 0.05.
Conclusion and interpretation
Direct answer: We do not reject the null hypothesis at the 5% level.
Interpretation: There is no statistically significant evidence of a difference between
the two population variances. The observed sample variances (10 and 9.82)
are extremely close, and the F-statistic reflects that closeness.
In short: based on this experiment, the variability of the two samples appears statistically
equal.
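The whole variance comparison fits in a few lines; a pure-Python sketch using the figures from the problem:

```python
# F-test for equality of two variances (lab-experiment data)
n1, ss1 = 10, 90      # sample 1: size, sum of squared deviations from its mean
n2, ss2 = 12, 108     # sample 2

# Unbiased sample variances: s^2 = SS / (n - 1)
s1_sq = ss1 / (n1 - 1)    # 90 / 9 = 10.0
s2_sq = ss2 / (n2 - 1)    # 108 / 11 = 9.818...

# Larger variance goes in the numerator so the ratio is >= 1
F = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
print(round(F, 4))        # 1.0185

# Approximate two-sided 5% critical value F_0.025(9, 11), read from tables
F_crit = 3.59
print(F > F_crit)         # False -> do not reject H0
```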
Step-by-step recap for exam clarity
1. Compute s₁² = 90/9 = 10 and s₂² = 108/11 ≈ 9.82.
2. Form F = (larger variance) / (smaller variance) = 10 / 9.82 ≈ 1.0185 with (9, 11) degrees of freedom.
3. Compare with the 5% critical value; 1.0185 is far below it, so H₀ stands.
Final statement: At the 5% level, the sample variances can be treated as equal.
Assumptions and good practice notes
To make the F-test valid, certain conditions should be respected:
Independence:
o The two samples must be independent of each other.
Normality:
o Each population should be approximately normally distributed. The F-test for
equality of variance is sensitive to non-normality; with skewed distributions,
Levene’s or Brown–Forsythe tests are often preferred. For many lab settings,
normality is a reasonable working assumption.
Measurement quality:
o Ensure the given sums of squares are computed as deviations from the
respective sample means; that is exactly how SS was turned into s² above.
Context matters:
o Even when the statistical test suggests equality of variance, practical
considerations (instrument precision, experimental setup) may still influence
analysis choices, such as whether to pool variances in subsequent t-tests.
Gentle narrative analogy
Think of variance like the “scatter” of tiny beads on two trays. If one tray’s beads sit slightly
farther from the center than the other’s, you might wonder: is that small difference real, or
just due to chance? The F-test holds up a measuring frame, compares the spreads, and says,
“This difference is too small to call it real at the 5% level.” In our case, the spreads are
practically twins—close enough that the test says, “Treat them as equals.”
What this enables next
Since the equality of variances is not rejected:
If your next step is comparing means, you can proceed with a pooled-variance t-test
(common in two-sample mean comparisons under equal variance assumptions).
If you’re designing future experiments, this result suggests consistency in
experimental variability across conditions, which is reassuring for pooling
information and reducing noise.
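As an illustration of that next step, a pooled-variance t-test on the two sample means can be sketched as follows (the means 15 and 14 come from the problem table; this comparison goes beyond what the question asks and is shown only for completeness):

```python
import math

# Pooled-variance t-test for the difference between the two sample means
n1, mean1, ss1 = 10, 15, 90
n2, mean2, ss2 = 12, 14, 108

# Pooled variance: combine both sums of squares over the combined df
sp_sq = (ss1 + ss2) / (n1 + n2 - 2)        # 198 / 20 = 9.9

# Standard error of the difference of means
se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))

t = (mean1 - mean2) / se
print(round(t, 3))    # 0.742
```

With t ≈ 0.74 against a two-tailed critical value of roughly t₀.₀₂₅(20) ≈ 2.086, the means would not differ significantly either.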
Final verdict
Interpretation: There is no significant evidence that the variances differ; treat them
as equal for subsequent analysis.
SECTION-D
7. A trucking company wishes to test the average life of each of the four brands of tyres.
The company uses all brands on randomly selected trucks. The records showing the lives
(thousands of miles) are given in the following table:
Brand 1   Brand 2   Brand 3   Brand 4
  20        19        21        15
  23        15        19        17
  18        17        20        16
  17        20        17        18
            16        16
Test the hypothesis that average life of each brand of tyres is same.
Ans: Testing the Equality of Average Life of Tyres: A Story of Four Brands
Imagine you are the owner of a trucking company, responsible for a fleet of trucks that
crisscross the country. Your business thrives on efficiency, and every delay or breakdown
costs both money and reputation. One of the most crucial decisions you face every few
years is: which brand of tyres should I use on my trucks? After all, tyres are the lifeline of
your vehicles. A tyre that wears out too soon not only affects safety but also increases
maintenance costs.
Now, you are presented with a choice among four brands of tyres. Each brand claims
superior durability, but marketing promises don’t always match reality. To make an
informed decision, you decide to test the average life of each tyre brand. You install all four
brands on randomly selected trucks and carefully record their lives in thousands of miles.
After months of monitoring, your data looks like this:
Brand 1   Brand 2   Brand 3   Brand 4
  20        19        21        15
  23        15        19        17
  18        17        20        16
  17        20        17        18
            16        16
The question now is crucial: Are these brands genuinely different in average life, or are the
differences we see just due to random chance?
To answer this, we turn to a powerful statistical tool called Analysis of Variance (ANOVA).
ANOVA allows us to compare the means of more than two groups to see if there is a
significant difference among them. In this case, our four groups are the four tyre brands.
Step 1: Understanding the Hypotheses
The first step in any statistical test is to clearly state the hypotheses:
Null Hypothesis (H₀): The average life of all four brands of tyres is the same.
Mathematically, H₀: μ₁ = μ₂ = μ₃ = μ₄
Alternative Hypothesis (H₁): At least one brand has an average life different from the
others.
Think of it as a courtroom trial. The null hypothesis is innocent until proven guilty. We
assume all brands are equal unless the data gives us strong evidence otherwise.
Step 2: Organizing the Data
To simplify our calculation, let’s write down the data for each brand clearly:
Brand 1: 20, 23, 18, 17
Brand 2: 19, 15, 17, 20, 16
Brand 3: 21, 19, 20, 17, 16
Brand 4: 15, 17, 16, 18
Notice that not all brands have the same number of observations. This is called an
unbalanced design, but for simplicity, we can still apply ANOVA using the standard
formulas.
Step 3: Calculating the Means
The next step is to calculate the mean (average life) of tyres for each brand. Let’s calculate
them carefully:
1. Brand 1 Mean: (20 + 23 + 18 + 17) / 4 = 78 / 4 = 19.5
2. Brand 2 Mean: (19 + 15 + 17 + 20 + 16) / 5 = 87 / 5 = 17.4
3. Brand 3 Mean: (21 + 19 + 20 + 17 + 16) / 5 = 93 / 5 = 18.6
4. Brand 4 Mean: (15 + 17 + 16 + 18) / 4 = 66 / 4 = 16.5
Finally, we calculate the grand mean, the average of all 18 observations:
Grand Mean = 324 / 18 = 18
The grand mean represents the overall average life of tyres, ignoring the brand distinction.
Step 4: Partitioning the Variability
ANOVA works by comparing two sources of variability:
1. Between-group variability (SSB): How much the brand means differ from the grand
mean.
2. Within-group variability (SSW): How much individual tyre lives differ from their own
brand mean.
Step 4a: Between-group Sum of Squares (SSB)
The formula is:

SSB = Σ nᵢ (x̄ᵢ − x̄)²

Where nᵢ is the size of group i, x̄ᵢ its mean, and x̄ the grand mean.
Calculating step by step:
Brand 1: 4 × (19.5 − 18)² = 4 × 2.25 = 9
Brand 2: 5 × (17.4 − 18)² = 5 × 0.36 = 1.8
Brand 3: 5 × (18.6 − 18)² = 5 × 0.36 = 1.8
Brand 4: 4 × (16.5 − 18)² = 4 × 2.25 = 9
SSB = 9 + 1.8 + 1.8 + 9 = 21.6

Step 4b: Within-group Sum of Squares (SSW)
The formula is:

SSW = Σᵢ Σⱼ (xᵢⱼ − x̄ᵢ)²

Brand 1: (0.5)² + (3.5)² + (1.5)² + (2.5)² = 21
Brand 2: (1.6)² + (2.4)² + (0.4)² + (2.6)² + (1.4)² = 17.2
Brand 3: (2.4)² + (0.4)² + (1.4)² + (1.6)² + (2.6)² = 17.2
Brand 4: (1.5)² + (0.5)² + (0.5)² + (1.5)² = 5
SSW = 21 + 17.2 + 17.2 + 5 = 60.4
Step 5: Calculating the F-Statistic
Now, we compute the mean squares:
1. Degrees of freedom:
Between groups: k − 1 = 4 − 1 = 3
Within groups: N − k = 18 − 4 = 14
2. Mean Squares:
MSB = SSB / 3 = 21.6 / 3 = 7.2
MSW = SSW / 14 = 60.4 / 14 ≈ 4.314
Finally, the F-statistic is:

F = MSB / MSW = 7.2 / 4.314 ≈ 1.67
Step 6: Comparing F with Critical Value
Our calculated F = 1.67
Critical F₀.₀₅(3, 14) ≈ 3.34
Since 1.67 < 3.34, we fail to reject the null hypothesis.
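Every number in Steps 3 to 6 can be reproduced with a short script; a pure-Python sketch over the tyre data:

```python
# One-way ANOVA for the four tyre brands (lives in thousands of miles)
groups = {
    "Brand 1": [20, 23, 18, 17],
    "Brand 2": [19, 15, 17, 20, 16],
    "Brand 3": [21, 19, 20, 17, 16],
    "Brand 4": [15, 17, 16, 18],
}

all_obs = [x for g in groups.values() for x in g]
N, k = len(all_obs), len(groups)
grand_mean = sum(all_obs) / N          # 324 / 18 = 18.0

# Between-group sum of squares: SSB = sum of n_i * (group mean - grand mean)^2
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())

# Within-group sum of squares: SSW = squared deviations inside each group
ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups.values())

msb = ssb / (k - 1)     # 21.6 / 3 = 7.2
msw = ssw / (N - k)     # 60.4 / 14 = 4.314...

F = msb / msw
print(round(F, 2))      # 1.67, below F_0.05(3, 14) = 3.34 -> fail to reject H0
```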
Step 7: Interpretation in Simple Words
This means that the differences in the average life of the four tyre brands are not
statistically significant. In other words, the variation we observe among brands could easily
be due to random chance rather than any real difference in performance.
As a trucking company owner, this suggests that any of the four brands could be chosen, at
least from the perspective of average life. Other factors such as cost, availability, and after-
sales service might now guide your decision.
Step 8: Making the Story Human
Think about it this way: You’ve set out on a mission to pick the “best” tyres, and after
careful data collection and analysis, you realize that nature (or chance) plays tricks on
averages. The slight differences you noticed (Brand 1 lasting 19.5k miles and Brand 4 lasting
16.5k miles) look dramatic at first glance. But statistics tells you that these differences are
within the range of natural variability.
It’s like tasting four cakes baked by different chefs. One seems sweeter than the others, but
a taste test with a dozen friends shows everyone has a slightly different perception.
Statistically, there is no clear winner.
Step 9: Lessons from ANOVA
1. Data tells the truth, but context matters. Even if one brand seems “better” in a
small sample, natural variability can account for the apparent difference.
2. Statistical tests prevent hasty decisions. Without ANOVA, you might have blindly
chosen Brand 1 and ignored Brand 4.
3. Future research can refine decisions. Increasing the sample size or testing tyres
under different road conditions may reveal differences not visible in this small
sample.
Conclusion
In summary, testing the equality of tyre life using ANOVA is like being a careful detective.
You observed, collected evidence, analyzed variations, and finally let the numbers speak.
The story of these four tyre brands teaches a timeless lesson: sometimes, the best decision
is not about choosing the “best,” but understanding that all options are comparable within
statistical limits.
In our case, the ANOVA test showed that the average life of all four brands is statistically
the same. So, the next time you hit the highway, know that your tyres’ fate is not about
brands alone—it’s also about careful driving, road conditions, and maintenance.
8. What is analysis of variance technique? Discuss its main assumptions. Also distinguish
between one way and two way ANOVA techniques
Ans: Analysis of variance technique
ANOVA is a statistical method used to test whether three or more group means are
significantly different. Instead of comparing means pairwise (which inflates error), ANOVA
partitions the total variation in the data into components due to different sources, then
evaluates whether the variation between groups is large relative to the variation within
groups.
Core idea: Compare “between-group” variance to “within-group” variance using an
F-ratio:

F = MSB / MSW = (between-group mean square) / (within-group mean square)

If F is large, the groups likely have different means.
What it answers: “Do group means differ?” It does not tell which pairs differ; for
that, we use post-hoc tests (Tukey, Bonferroni).
Where it’s used: Agriculture (yield across fertilizers), education (scores across
teaching methods), psychology (response across treatments), industry (process
lines).
Main assumptions of ANOVA
To trust the F-test, certain conditions should hold:
Independence: Observations are independent across and within groups. Violations
(e.g., repeated measures without accounting for the correlation) bias results.
Normality: For each group, the outcome variable is approximately normally
distributed. ANOVA is robust to mild departures, especially with equal sample sizes.
Homogeneity of variances: Group variances are roughly equal (homoscedasticity). If
not, use Welch’s ANOVA or transform data.
Additivity and linearity: The model assumes effects add without strange nonlinear
interactions (in one-way models). Two-way ANOVA explicitly models interaction.
Fixed-effects (in classical ANOVA): Factor levels are specific and of interest; if levels
are random, use random/mixed-effects ANOVA.
Random sampling: Samples from populations are random and representative.
One-way vs two-way ANOVA
Think of factors as reasons why data vary. One-way ANOVA has one reason; two-way has
two reasons and asks if they interact.
One-way ANOVA
Design: One factor (e.g., teaching method with k levels).
Model: Xᵢⱼ = μ + αᵢ + εᵢⱼ, where μ is the overall mean, αᵢ the effect of factor
level i, and εᵢⱼ random error.
Sources of variation:
o Between groups (factor)
o Within groups (error)
Use when: You have one categorical independent variable (factor) and a continuous
dependent variable.
Question answered: Do mean outcomes differ across the k levels?
Two-way ANOVA
Design: Two factors (e.g., teaching method and class size), each with multiple levels;
may include interaction.
Model: Xᵢⱼₖ = μ + αᵢ + βⱼ + (αβ)ᵢⱼ + εᵢⱼₖ, where αᵢ and βⱼ are the main
effects and (αβ)ᵢⱼ the interaction.
Sources of variation:
o Factor A main effect
o Factor B main effect
o Interaction effect
o Error
Use when: You want to examine two categorical predictors and whether the effect
of one depends on the other.
Question answered: Do means differ across Factor A? Factor B? Do the factors
interact?
Key distinctions
Number of factors: One-way has 1; two-way has 2 (plus interaction).
Interpretation: One-way shows differences across groups. Two-way shows separate
main effects plus whether the effects combine non-additively.
Efficiency: Two-way ANOVA can control for a second source of variation and
improve precision.
Design requirement: Two-way often requires balanced or near-balanced designs for
clean interpretation.
Concept diagram: ANOVA flow
Data → partition total variation into SSB and SSW → compute MSB and MSW →
form F = MSB / MSW → compare with the critical F value → reject or retain H₀.
Practical intuition
Between-group variance: Captures how far group means are spread from the overall
mean.
Within-group variance: Captures how scattered individual observations are around
their group mean.
If groups truly differ: Between-group variance should be meaningfully larger than
within-group variance.
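The intuition can be made concrete with two toy scenarios (the numbers are invented purely for illustration): both have group means of 10 and 20, but only the low-scatter scenario produces a large F.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F-statistic: between-group MS over within-group MS."""
    all_obs = [x for g in groups for x in g]
    gm = mean(all_obs)
    k, n = len(groups), len(all_obs)
    ssb = sum(len(g) * (mean(g) - gm) ** 2 for g in groups)
    ssw = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))

tight = [[9, 10, 11], [19, 20, 21]]    # little scatter within each group
loose = [[0, 10, 20], [10, 20, 30]]    # heavy scatter within each group

print(round(f_ratio(tight), 1))   # 150.0 -> the group difference stands out
print(round(f_ratio(loose), 2))   # 1.5   -> same means, but noise drowns it
```

Identical between-group separation, yet wildly different F values: within-group scatter is what decides whether a difference in means is detectable.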
Worked test: Equality of sample variances at 5% (F-test)
You’re given the same laboratory data as in Question 6:
Sample 1: n₁ = 10, SS₁ = 90
Sample 2: n₂ = 12, SS₂ = 108

Step 1: Compute sample variances
Unbiased sample variance: s² = SS / (n − 1)
s₁² = 90 / 9 = 10
s₂² = 108 / 11 ≈ 9.818

Step 2: Form the F-statistic
F = s₁² / s₂² = 10 / 9.818 ≈ 1.0185, with (9, 11) degrees of freedom
Step 3: Decision rule (two-sided)
Compare F with the upper critical value F₀.₀₂₅(9, 11) ≈ 3.59; since 1.0185 is far below
it, F does not fall in the rejection region.
Step 4: Conclusion
Decision: Do not reject H₀.
Interpretation: At the 5% level, there is no evidence that the population variances
differ. The sample variances (10 and 9.82) are almost identical; the F-ratio confirms
equality of variances.
When to prefer alternative variance tests
If normality is doubtful: Consider Levene’s test or Brown–Forsythe test (more robust
to non-normality).
If sample sizes differ greatly: Robust tests or transformations may be preferable.
How ANOVA uses variance equality
In classical one-way ANOVA, the assumption of equal variances across groups underpins
pooling the within-group variance into a single MSW. The preceding F-test is precisely the
kind of pre-check often done when comparing two groups (e.g., deciding between pooled t-
test or Welch’s t-test).
A quick study guide summary
ANOVA purpose: Test if multiple group means differ by comparing between-group
to within-group variance via an F-statistic.
Assumptions: Independence, normality, homogeneity of variances, additivity, fixed
effects, random sampling.
One-way vs two-way:
o One-way: single factor, sources = between + within.
o Two-way: two factors + interaction, sources = A + B + A×B + error.
Variance equality test (lab example):
o Compute variances from sums of squares.
o Form F with larger variance on top.
o Compare to critical values; here, F ≈ 1.0185 → equal variances.
Closing analogy
Back to our music hall: ANOVA is the judge who listens not just to the loud notes but to the
pattern of variation within each group and across groups, deciding whether differences in
melodies are genuine or merely noise. And before comparing melodies, we check that the
instruments (variances) are tuned similarly. Your F-test did exactly that: both instruments
are equally tuned, so you can fairly compare the music next.
“This paper has been carefully prepared for educational purposes. If you notice any mistakes or
have suggestions, feel free to share your feedback.”